<!DOCTYPE html>
<html >
<head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<meta name="generator" content="hevea 2.32">
<link rel="stylesheet" type="text/css" href="omniORBpy.css">
<title>Code set conversion</title>
</head>
<body >
<a href="omniORBpy007.html"><img src="previous_motif.svg" alt="Previous"></a>
<a href="index.html"><img src="contents_motif.svg" alt="Up"></a>
<a href="omniORBpy009.html"><img src="next_motif.svg" alt="Next"></a>
<hr>
<h1 id="sec113" class="chapter">Chapter&#XA0;8&#XA0;&#XA0;Code set conversion</h1>
<p>
<a id="chap:codesets"></a></p><p>omniORB supports full code set negotiation, used to select and
translate between different character sets, for the transmission of
chars, strings, wchars and wstrings. The support is mostly transparent
to application code, but there are a number of options that can be
selected. This chapter covers the options, and also gives some
pointers about how to implement your own code sets, in case the ones
that come with omniORB are not sufficient.</p>
<h2 id="sec114" class="section">8.1&#XA0;&#XA0;Native code set</h2>
<p>For the ORB to know how to handle strings given to it by the
application, it must know what code set they are represented with, so
it can properly translate them if need be.</p><p>For Python 2.x, the default is ISO 8859-1 (Latin 1). A different code
set can be chosen at initialisation time with the
<span style="font-family:monospace">nativeCharCodeSet</span> parameter. The supported code sets are
printed out at initialisation time if the ORB traceLevel is 15 or
greater. Some applications may need to set the native char code set to
UTF-8, allowing the full Unicode range to be supported in strings.</p><p>In Python 3.x, all Python strings are Unicode, so it always behaves as
if the native char code set is UTF-8.</p><p>wchar and wstring are always represented by the Python Unicode type,
so there is no need to select a native code set for wchar.</p>
<h2 id="sec115" class="section">8.2&#XA0;&#XA0;Default code sets</h2>
<p>The way code set conversion is meant to work in CORBA communication is
that each client and server has a <span style="font-style:italic">native</span> code set that it uses
for character data in application code, and supports a number of
<span style="font-style:italic">transmission</span> code sets that is uses for communication. When a
client connects to a server, the client picks one of the server&#X2019;s
transmission code sets to use for the interaction. For that to work,
the client plainly has to know the server&#X2019;s supported transmission
code sets.</p><p>Code set information from servers is embedded in IORs. A client with
an IOR from a server should therefore know what transmission code sets
the server supports. This approach can fail for two reasons:</p><ol class="enumerate" type=1><li class="li-enumerate">
A <span style="font-family:monospace">corbaloc</span> URI (see chapter&#XA0;<a href="omniORBpy007.html#chap%3Ains">7</a>) does
not contain any code set information.</li><li class="li-enumerate">Some badly-behaved servers that do support code set conversion
fail to put codeset information in their IORs.
</li></ol><p>The CORBA standard says that if a server has not specified
transmission code set information, clients must assume that they only
support ISO-8859-1 for char and string, and do not support wchar and
wstring at all. The effect is that client code receives
<span style="font-family:monospace">DATA_CONVERSION</span> or <span style="font-family:monospace">BAD_PARAM</span> exceptions.</p><p>To avoid this issue, omniORB allows you to configure <span style="font-style:italic">default</span>
code sets that are used as a server&#X2019;s transmission code sets if they
are not otherwise known. Set <span style="font-family:monospace">defaultCharCodeSet</span> for char and
string data, and <span style="font-family:monospace">defaultWCharCodeSet</span> for wchar and wstring data.</p>
<h2 id="sec116" class="section">8.3&#XA0;&#XA0;Code set library</h2>
<p>To save space in the main ORB core library, most of the code set
implementations are in a separate library. To load it from Python, you
must import the <span style="font-family:monospace">omniORB.codesets</span> module before calling
<span style="font-family:monospace">CORBA.ORB_init()</span>.</p>
<h2 id="sec117" class="section">8.4&#XA0;&#XA0;Implementing new code sets</h2>
<p>Code sets must currently be implemented in C++. See the omniORB for
C++ documentation for details.</p>
<hr>
<a href="omniORBpy007.html"><img src="previous_motif.svg" alt="Previous"></a>
<a href="index.html"><img src="contents_motif.svg" alt="Up"></a>
<a href="omniORBpy009.html"><img src="next_motif.svg" alt="Next"></a>
</body>
</html>
